Lab #1, Winter 2020: Wrangling and Maps (of San Francisco, no less!)

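First, I’ll read in the SF trees data (not the street map; that comes later). The code in this lab assumes the tidyverse, here, sf, and tmap packages are already attached.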

sf_trees <- read_csv(here("data", "sf_trees", "sf_trees.csv"))
## 
## ── Column specification ────────────────────────────────────────────────────────
## cols(
##   tree_id = col_double(),
##   legal_status = col_character(),
##   species = col_character(),
##   address = col_character(),
##   site_order = col_double(),
##   site_info = col_character(),
##   caretaker = col_character(),
##   date = col_date(format = ""),
##   dbh = col_double(),
##   plot_size = col_character(),
##   latitude = col_double(),
##   longitude = col_double()
## )

Basic Wrangling Reminders

I’m refreshing some skills for wrangling and summary statistics. Most of these functions come from the ‘dplyr’ package.

Let’s find the five legal statuses with the most tree observations. First I’ll wrangle, then I’ll make a graph.

top_5_status <- sf_trees %>% 
  count(legal_status) %>% 
  drop_na(legal_status) %>% 
  rename(tree_count = n) %>% 
  # select() could also reorder or remove columns, but here we use relocate(),
  # which moves columns around. Given just one column name, it moves that
  # column to the "front" (the left).
  relocate(tree_count) %>% 
  slice_max(tree_count, n = 5)
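
To see what each step of that pipeline does, here is a minimal sketch on a made-up data frame. The name toy_trees and its values are hypothetical, not part of sf_trees.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data: six trees, one with a missing legal status
toy_trees <- tibble(
  tree_id = 1:6,
  legal_status = c("DPW Maintained", "Permitted Site", "DPW Maintained",
                   NA, "DPW Maintained", "Permitted Site")
)

toy_counts <- toy_trees %>%
  count(legal_status) %>%        # one row per status, counts land in column n
  drop_na(legal_status) %>%      # drop the row that counts the NA status
  rename(tree_count = n) %>%     # give the count column a clearer name
  relocate(tree_count) %>%       # move tree_count to the first column
  slice_max(tree_count, n = 1)   # keep only the row with the largest count

toy_counts
```

Note that count() keeps NA as its own group, which is why drop_na() comes after it here.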

Make a graph of the top five legal statuses by tree count. This uses top_5_status, the wrangled version of sf_trees.

ggplot(data = top_5_status, aes(x = fct_reorder(legal_status, tree_count), y = tree_count)) +
  geom_col() + 
  labs(x = "Legal Status", y = "Tree Count", title = "SF Trees") +
  coord_flip() +
  theme_minimal()
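
The fct_reorder() call is what sorts the bars. As a small sketch (the vectors status and n_trees are made up for illustration), it turns a character vector into a factor whose levels are ordered by a second variable:

```r
library(forcats)

# Hypothetical values: three statuses and their tree counts
status <- c("A", "B", "C")
n_trees <- c(10, 30, 20)

# Levels are ordered by ascending n_trees rather than alphabetically
status_ordered <- fct_reorder(status, n_trees)
levels(status_ordered)
```

Because the levels run from smallest to largest count, coord_flip() should draw the largest bar at the top.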

A few more wrangling examples. I only want to keep observations (rows) where the species name contains "Blackwood Acacia", and only four of the columns.

blackwood_acacia <- sf_trees %>% 
  filter(str_detect(species, "Blackwood Acacia")) %>% 
  select(legal_status, date, latitude, longitude)
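
A quick sketch of how filter() and str_detect() behave together, on a hypothetical toy_species data frame (the species strings below are illustrative, not copied from the data):

```r
library(dplyr)
library(stringr)

# Hypothetical toy data, including a missing species
toy_species <- tibble(
  species = c("Acacia melanoxylon :: Blackwood Acacia",
              "Some other genus :: Some Other Tree",
              NA),
  dbh = c(12, 3, 5)
)

# str_detect() returns TRUE, FALSE, or NA for each row;
# filter() keeps only the rows where the result is TRUE
toy_acacia <- toy_species %>%
  filter(str_detect(species, "Blackwood Acacia"))

toy_acacia
```

The match is a partial, case-sensitive pattern match, so the full species string does not need to be typed out.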

Make a faux map based on latitude and longitude. BUT R does not yet know that these columns are spatial coordinates; it just plots them as ordinary numbers.

ggplot(data = blackwood_acacia, aes(x = longitude, y = latitude)) +
  geom_point(size = 0.5)
## Warning: Removed 27 rows containing missing values (geom_point).

Using tidyr::separate() and unite() functions.

Useful for combining or separating columns.

sf_trees_sep <- sf_trees %>% 
  separate(species, into = c("spp_scientific", "spp_common"), sep = "::")

Example: tidyr::unite() to combine tree ID and legal status into one column (a weird thing to do, but it shows how it works)

sf_tree_united <- sf_trees %>% 
  unite("id_status", tree_id:legal_status, sep = "_cool!_")
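
Here is a minimal sketch of both functions on a made-up data frame, so the resulting columns are easy to inspect. The name toy and its values are hypothetical.

```r
library(dplyr)
library(tidyr)

# Hypothetical toy data with a "::"-delimited species column
toy <- tibble(
  tree_id = c(1, 2),
  legal_status = c("Permitted Site", "DPW Maintained"),
  species = c("Genus one::Common one", "Genus two::Common two")
)

# separate(): split one column into two at the "::" delimiter
toy_sep <- toy %>%
  separate(species, into = c("spp_scientific", "spp_common"), sep = "::")

# unite(): paste a range of columns together into one new column
toy_united <- toy %>%
  unite("id_status", tree_id:legal_status, sep = "_")

names(toy_sep)
toy_united$id_status
```

By default both functions remove their input columns, so species disappears from toy_sep and tree_id and legal_status disappear from toy_united.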

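Make an actual map of Blackwood Acacia in SF. This matters because the scatterplot above treats longitude and latitude as plain numbers, with no coordinate reference system (CRS) attached.

Use ‘st_as_sf()’ to convert latitude and longitude to spatial coordinates, then set the CRS to EPSG:4326 (WGS 84).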

blackwood_acacia_sp <- blackwood_acacia %>% 
  drop_na(longitude, latitude) %>% 
  st_as_sf(coords = c("longitude", "latitude"))

st_crs(blackwood_acacia_sp) <- 4326
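
As a sketch of the same idea on made-up points: st_as_sf() also accepts a crs argument, so the conversion and the CRS can be set in one call. The data frame toy_pts and its coordinates are hypothetical.

```r
library(sf)
library(tidyr)

# Hypothetical points; the third row has a missing coordinate
toy_pts <- data.frame(
  name = c("a", "b", "c"),
  longitude = c(-122.45, -122.40, NA),
  latitude = c(37.76, 37.78, 37.77)
)

# Drop rows with missing coordinates before converting them to geometry
toy_pts_complete <- drop_na(toy_pts, longitude, latitude)

# coords is given in x, y order: longitude first, then latitude
toy_sf <- st_as_sf(toy_pts_complete,
                   coords = c("longitude", "latitude"),
                   crs = 4326)

st_crs(toy_sf)$epsg     # EPSG code stored on the object
st_is_longlat(toy_sf)   # whether the coordinates are geographic (degrees)
```

After the conversion, the longitude and latitude columns are replaced by a single geometry column.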

Making a plot with spatial data (geom_sf) instead of geom_point.

ggplot(data = blackwood_acacia_sp) +
  geom_sf(size = 0.4, 
          color = "goldenrod")

Read in the SF roads shapefile, then transform it to the same CRS as the tree points (EPSG:4326):

sf_map <- read_sf(here("data", "sf_map", "tl_2017_06075_roads.shp"))

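# Assign the result: st_transform() returns a new object and leaves its input unchanged
sf_map <- st_transform(sf_map, 4326)
sf_map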
## Simple feature collection with 4087 features and 4 fields
## geometry type:  LINESTRING
## dimension:      XY
## bbox:           xmin: -122.5136 ymin: 37.70813 xmax: -122.3496 ymax: 37.83213
## geographic CRS: WGS 84
## # A tibble: 4,087 x 5
##    LINEARID   FULLNAME     RTTYP MTFCC                                  geometry
##  * <chr>      <chr>        <chr> <chr>                          <LINESTRING [°]>
##  1 110498938… Hwy 101 S O… M     S1400 (-122.4041 37.74842, -122.404 37.7483, -…
##  2 110498937… Hwy 101 N o… M     S1400 (-122.4744 37.80691, -122.4746 37.80684,…
##  3 110366022… Ludlow Aly … M     S1780 (-122.4596 37.73853, -122.4596 37.73845,…
##  4 110608181… Mission Bay… M     S1400 (-122.3946 37.77082, -122.3929 37.77092,…
##  5 110366689… 25th Ave N   M     S1400 (-122.4858 37.78953, -122.4855 37.78935,…
##  6 110368970… Willard N    M     S1400 (-122.457 37.77817, -122.457 37.77812, -…
##  7 110368970… 25th Ave N   M     S1400 (-122.4858 37.78953, -122.4858 37.78952,…
##  8 110498933… Avenue N     M     S1400 (-122.3643 37.81947, -122.3638 37.82064,…
##  9 110368970… 25th Ave N   M     S1400  (-122.4854 37.78983, -122.4858 37.78953)
## 10 110367749… Mission Bay… M     S1400 (-122.3865 37.77086, -122.3878 37.77076,…
## # … with 4,077 more rows
ggplot(data = sf_map) +
  geom_sf(size = 0.3)
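
One thing worth checking with st_transform(): it returns a transformed copy and leaves its input unchanged, so the result has to be assigned to be kept. A minimal sketch, using a hypothetical single point that I assume starts in NAD83 (EPSG:4269):

```r
library(sf)

# Hypothetical point, assumed to start in NAD83 (EPSG:4269)
pt <- st_sfc(st_point(c(-122.45, 37.76)), crs = 4269)

# Called without assignment, the transformed copy is only printed
st_transform(pt, 4326)
epsg_before <- st_crs(pt)$epsg   # pt itself still has its original CRS

# Assigning the result is what actually updates the object
pt <- st_transform(pt, 4326)
epsg_after <- st_crs(pt)$epsg

c(before = epsg_before, after = epsg_after)
```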

Combine the Blackwood Acacia plot with the SF roads map:

ggplot() +
  geom_sf(data = sf_map, 
          size = 0.1, 
          color = "darkgray") +
  geom_sf(data = blackwood_acacia_sp,
          color = "red", 
          size = 0.3) +
  theme_void()

Now to create an interactive map:

tmap_mode("view")
## tmap mode set to interactive viewing
tm_shape(blackwood_acacia_sp) + 
  tm_dots()
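
The viewing mode is a session-wide setting, so it stays interactive until it is switched back. A small sketch of returning to static plotting, using hypothetical points in place of blackwood_acacia_sp:

```r
library(sf)
library(tmap)

# Hypothetical points standing in for the tree locations
toy_pts <- st_as_sf(
  data.frame(lon = c(-122.45, -122.40), lat = c(37.76, 37.78)),
  coords = c("lon", "lat"),
  crs = 4326
)

# "plot" gives static maps; "view" gives interactive ones
tmap_mode("plot")

# The same map code works in either mode
toy_map <- tm_shape(toy_pts) +
  tm_dots()

toy_map
```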